Results 1 - 18 of 18
1.
J Vis Exp ; (196)2023 06 23.
Article in English | MEDLINE | ID: mdl-37427934

ABSTRACT

Cell polarity is a macroscopic phenomenon established by a collection of spatially concentrated molecules and structures that culminate in the emergence of specialized domains at the subcellular level. It is associated with developing asymmetric morphological structures that underlie key biological functions such as cell division, growth, and migration. In addition, the disruption of cell polarity has been linked to tissue-related disorders such as cancer and gastric dysplasia. Current methods to evaluate the spatiotemporal dynamics of fluorescent reporters in individual polarized cells often involve manual steps to trace a midline along the cells' major axis, which is time consuming and prone to strong biases. Furthermore, although ratiometric analysis can correct the uneven distribution of reporter molecules using two fluorescence channels, background subtraction techniques are frequently arbitrary and lack statistical support. This manuscript introduces a novel computational pipeline to automate and quantify the spatiotemporal behavior of single cells using a model of cell polarity: pollen tube/root hair growth and cytosolic ion dynamics. A three-step algorithm was developed to process ratiometric images and extract a quantitative representation of intracellular dynamics and growth. The first step segments the cell from the background, producing a binary mask through a thresholding technique in the pixel intensity space. The second step traces a path through the midline of the cell through a skeletonization operation. Finally, the third step provides the processed data as a ratiometric timelapse and yields a ratiometric kymograph (i.e., a 1D spatial profile through time). Data from ratiometric images acquired with genetically encoded fluorescent reporters from growing pollen tubes were used to benchmark the method. 
This pipeline allows for faster, less biased, and more accurate representation of the spatiotemporal dynamics along the midline of polarized cells, thus advancing the quantitative toolkit available to investigate cell polarity. The AMEBaS Python source code is available at: https://github.com/badain/amebas.git.


Subject(s)
Cell Polarity , Software , Time-Lapse Imaging , Algorithms , Pollen Tube , Coloring Agents
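
The three steps named in the abstract (thresholding, midline tracing, ratiometric kymograph) can be sketched as below. This is only an illustrative simplification and not the AMEBaS code: it assumes a roughly horizontal cell, a fixed user-supplied threshold, and it replaces skeletonization with a per-column centroid of the mask. All function names are hypothetical.

```python
def segment(frame, threshold):
    """Step 1: binary mask obtained by thresholding pixel intensities."""
    return [[1 if px > threshold else 0 for px in row] for row in frame]


def midline(mask):
    """Step 2 (simplified): one (row, col) point per column, taken as the
    centroid of the mask pixels in that column. A real pipeline would
    skeletonize the mask; this stand-in only works for horizontal cells."""
    path = []
    n_rows, n_cols = len(mask), len(mask[0])
    for c in range(n_cols):
        rows = [r for r in range(n_rows) if mask[r][c]]
        if rows:
            path.append((sum(rows) // len(rows), c))
    return path


def ratio_profile(ch1, ch2, path):
    """Step 3: ratio of the two channels sampled along the midline."""
    return [ch1[r][c] / ch2[r][c] for r, c in path if ch2[r][c] != 0]


def kymograph(ch1_frames, ch2_frames, threshold):
    """Stack one 1D ratiometric profile per time point (rows = time)."""
    rows = []
    for ch1, ch2 in zip(ch1_frames, ch2_frames):
        mask = segment(ch2, threshold)  # assumption: channel 2 defines the cell
        rows.append(ratio_profile(ch1, ch2, midline(mask)))
    return rows
```

Each row of the returned list is the spatial ratio profile for one frame, which is the "1D spatial profile through time" the abstract refers to.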
2.
Sci Rep ; 13(1): 10265, 2023 06 24.
Article in English | MEDLINE | ID: mdl-37355705

ABSTRACT

Febrile seizures during early childhood are a relevant risk factor for the development of mesial temporal lobe epilepsy. Nevertheless, the molecular mechanisms induced by febrile seizures that render the brain susceptible or not susceptible to epileptogenesis remain poorly understood. Because the temporal investigation of such mechanisms in human patients is impossible, rat models of hyperthermia-induced febrile seizures have been used for that purpose. Here we conducted a temporal analysis of the transcriptomic and microRNA changes in the ventral CA3 of rats that develop (HS group) or do not develop (HNS group) seizures after hyperthermic insult on the eleventh postnatal day. The selected time intervals corresponded to acute, latent, and chronic phases of the disease. We found that the transcriptional differences between the HS and the HNS groups are related to inflammatory pathways, immune response, neurogenesis, and dendritogenesis in the latent and chronic phases. Additionally, the HNS group expressed a greater number of miRNAs (some abundantly expressed) as compared to the HS group. These results indicate that HNS rats were able to modulate their inflammatory response after insult, thus presenting better tissue repair and re-adaptation. Potential therapeutic targets, including genes, miRNAs and signaling pathways involved in epileptogenesis, were identified.


Subject(s)
Hyperthermia, Induced , MicroRNAs , Seizures, Febrile , Humans , Child, Preschool , Rats , Animals , Seizures, Febrile/genetics , Transcriptome , Hippocampus , MicroRNAs/genetics , Disease Susceptibility
3.
Sci Rep ; 13(1): 898, 2023 01 17.
Article in English | MEDLINE | ID: mdl-36650374

ABSTRACT

Since the molecular mechanisms determining COVID-19 severity are not yet well understood, there is a demand for biomarkers derived from comparative transcriptome analyses of mild and severe cases, combined with patients' clinico-demographic and laboratory data. Here the transcriptomic response of human leukocytes to SARS-CoV-2 infection was investigated by focusing on the differences between mild and severe cases and between age subgroups (younger and older adults). Three transcriptional modules correlated with these traits were functionally characterized, as well as 23 differentially expressed genes (DEGs) associated with disease severity. One module, correlated with severe cases and older patients, had an overrepresentation of genes involved in innate immune response and in neutrophil activation, whereas two other modules, correlated with disease severity and younger patients, harbored genes involved in the innate immune response to viral infections, and in the regulation of this response. This transcriptomic mechanism could be related to the better outcome observed in younger COVID-19 patients. The DEGs, all hyper-expressed in the group of severe cases, were mostly involved in neutrophil activation and in the p53 pathway, and are therefore related to inflammation and lymphopenia. These biomarkers may be useful for a better stratification of risk factors in COVID-19.


Subject(s)
Age Factors , COVID-19 , Patient Acuity , Humans , Biomarkers/metabolism , COVID-19/genetics , Leukocytes/metabolism , SARS-CoV-2/metabolism , Transcriptome
4.
BMC Bioinformatics ; 16: 35, 2015 02 05.
Article in English | MEDLINE | ID: mdl-25652056

ABSTRACT

BACKGROUND: In this study, clustering was performed using a bitmap representation of HIV reverse transcriptase and protease sequences, to produce an unsupervised classification of HIV sequences. The classification will aid our understanding of the interactions between mutations and drug resistance. 10,229 HIV genomic sequences from the protease and reverse transcriptase regions of the pol gene and antiretroviral resistance-related mutations represented in an 82-dimensional binary vector space were analyzed. RESULTS: A new cluster representation was proposed using an image inspired by microarray data, such that the rows in the image represented the protein sequences from the genotype data and the columns represented presence or absence of mutations in each protein position. The visualization of the clusters showed that some mutations frequently occur together and are probably related to an epistatic phenomenon. CONCLUSION: We described a methodology based on the application of a pattern recognition algorithm using binary data to suggest clusters of mutations that can easily be discriminated by cluster viewing schemes.


Subject(s)
Algorithms , Drug Resistance, Viral/genetics , HIV Protease/genetics , HIV Reverse Transcriptase/genetics , HIV-1/genetics , Mutation/genetics , Anti-HIV Agents/pharmacology , Genotype , HIV Infections/drug therapy , HIV Infections/epidemiology , HIV Infections/virology , HIV-1/drug effects , HIV-1/enzymology , Humans
5.
Gene ; 541(2): 129-37, 2014 May 15.
Article in English | MEDLINE | ID: mdl-24631265

ABSTRACT

Inference of gene regulatory networks (GRNs) is one of the most challenging research problems of Systems Biology. In this investigation, a new GRN inference methodology, called Entropic Biological Score (EBS), which linearly combines the mean conditional entropy (MCE) from expression levels and a Biological Score (BS), obtained by integrating different biological data sources, is proposed. The EBS is validated with the Cell Cycle related functional annotation information, available from Munich Information Center for Protein Sequences (MIPS), and compared with some existing methods like MRNET, ARACNE, CLR and MCE for GRN inference. For real networks, the performance of EBS, which uses the concept of integrating different data sources, is found to be superior to the aforementioned inference methods. The best results for EBS are obtained by considering the weights w1=0.2 and w2=0.8 for MCE and BS values, respectively, where approximately 40% of the inferred connections are found to be correct and significantly better than related methods. The results also indicate that the expression profile is able to recover some true connections that are not present in biological annotations, thus leading to the possibility of discovering new relations between genes.


Subject(s)
Cell Cycle/genetics , Computational Biology/methods , Gene Regulatory Networks , Entropy , Gene Expression , Models, Theoretical , Phenotype , Protein Interaction Mapping
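
The linear combination described in the abstract can be illustrated with a short sketch. The abstract gives only the weights (w1=0.2 for MCE, w2=0.8 for BS); the scaling and orientation of the two terms are assumptions here: both are taken to lie in [0, 1] with smaller values meaning a better candidate. The function names and the example values are hypothetical.

```python
def ebs_score(mce, bs, w1=0.2, w2=0.8):
    """Weighted sum of the mean conditional entropy (mce) and the
    biological score (bs). Assumes both are pre-scaled to [0, 1] and
    oriented so that lower is better; the paper may normalize differently."""
    return w1 * mce + w2 * bs


def rank_predictors(candidates, w1=0.2, w2=0.8):
    """candidates maps a predictor gene name to an (mce, bs) pair.
    Returns the names sorted from best (lowest score) to worst."""
    return sorted(candidates, key=lambda g: ebs_score(*candidates[g], w1, w2))
```

For example, `rank_predictors({"geneA": (0.5, 0.1), "geneB": (0.1, 0.9)})` places `geneA` first, because with w2=0.8 the biological score dominates the ranking.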
6.
BMC Genomics ; 13 Suppl 6: S7, 2012.
Article in English | MEDLINE | ID: mdl-23134775

ABSTRACT

BACKGROUND: A current challenge in gene annotation is to define the gene function in the context of the network of relationships instead of using single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how several components of this network interact with each other and keep their functions stable. However, in general there are not sufficient data to accurately recover the GNs from their expression levels, leading to the curse of dimensionality, in which the number of variables is higher than the number of samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. The use of several types of biological information in inference methods has increased significantly in recent years, in order to better recover the connections between genes and reduce false positives. What makes this strategy so interesting is the possibility of confirming known connections through the included biological data, and the possibility of discovering new relationships between genes when the expression data are observed. Although several works in data integration have increased the performance of network inference methods, the real contribution of each type of biological information to the obtained improvement is not clear. METHODS: We propose a methodology to include biological information in an inference algorithm in order to assess the prediction gain obtained by using biological information and expression profiles together. We also evaluated and compared the gain of adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. RESULTS AND CONCLUSIONS: This work presents a first comparison of the gain in the use of prior biological information in the inference of GNs by considering a eukaryotic organism (P. falciparum).
Our results indicate that information based on direct interaction can produce a higher improvement in the gain than data about a less specific relationship such as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for the improvement of the inference. We also compared the gain in the inference of the global network and of only the hubs. The results indicate that the use of biological information can improve the identification of the most connected proteins.


Subject(s)
Gene Regulatory Networks , Algorithms , Databases, Genetic , Genome , Plasmodium falciparum/genetics , Protein Interaction Maps , Proteins/metabolism
7.
BMC Syst Biol ; 5: 61, 2011 May 05.
Article in English | MEDLINE | ID: mdl-21545720

ABSTRACT

BACKGROUND: The inference of gene regulatory networks (GRNs) from large-scale expression profiles is one of the most challenging problems of Systems Biology nowadays. Many techniques and models have been proposed for this task. However, it is not generally possible to recover the original topology with great accuracy, mainly due to the short time series data in face of the high complexity of the networks and the intrinsic noise of the expression measurements. In order to improve the accuracy of GRNs inference methods based on entropy (mutual information), a new criterion function is here proposed. RESULTS: In this paper we introduce the use of generalized entropy proposed by Tsallis, for the inference of GRNs from time series expression profiles. The inference process is based on a feature selection approach and the conditional entropy is applied as criterion function. In order to assess the proposed methodology, the algorithm is applied to recover the network topology from temporal expressions generated by an artificial gene network (AGN) model as well as from the DREAM challenge. The adopted AGN is based on theoretical models of complex networks and its gene transference function is obtained from random drawing on the set of possible Boolean functions, thus creating its dynamics. On the other hand, DREAM time series data presents variation of network size and its topologies are based on real networks. The dynamics are generated by continuous differential equations with noise and perturbation. By adopting both data sources, it is possible to estimate the average quality of the inference with respect to different network topologies, transfer functions and network sizes. CONCLUSIONS: A remarkable improvement of accuracy was observed in the experimental results by reducing the number of false connections in the inferred topology by the non-Shannon entropy. 
The obtained best free parameter of the Tsallis entropy was on average in the range 2.5 ≤ q ≤ 3.5 (hence, subextensive entropy), which opens new perspectives for GRNs inference methods based on information theory and for investigation of the nonextensivity of such networks. The inference algorithm and criterion function proposed here were implemented and included in the DimReduction software, which is freely available at http://sourceforge.net/projects/dimreduction and http://code.google.com/p/dimreduction/.


Subject(s)
Computational Biology/methods , Entropy , Gene Regulatory Networks , Models, Genetic , Time Factors
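
To make the criterion function concrete, the sketch below computes the Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy (in nats) as q approaches 1, and uses it in a mean conditional entropy of a target gene given a predictor. The way the conditional form is averaged here is an assumption for illustration and may differ from the DimReduction implementation; function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tsallis_entropy(probs, q):
    """Tsallis entropy of a discrete distribution; Shannon limit at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


def mean_conditional_entropy(predictor_states, target_states, q):
    """Average, over observed predictor states x, of the Tsallis entropy of
    the target distribution given x, weighted by the frequency of x.
    Lower values suggest the predictor explains the target better."""
    groups = defaultdict(Counter)
    for x, y in zip(predictor_states, target_states):
        groups[x][y] += 1
    n = len(target_states)
    total = 0.0
    for counts in groups.values():
        n_x = sum(counts.values())
        probs = [c / n_x for c in counts.values()]
        total += (n_x / n) * tsallis_entropy(probs, q)
    return total
```

With quantized expression states, a predictor that fully determines the target yields a mean conditional entropy of 0 for any q, while an uninformative binary predictor of a balanced binary target yields 0.5 at q = 2.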
8.
J Comput Biol ; 18(10): 1353-67, 2011 Oct.
Article in English | MEDLINE | ID: mdl-21548810

ABSTRACT

Thanks to recent advances in molecular biology, allied to an ever increasing amount of experimental data, the functional state of thousands of genes can now be extracted simultaneously by using methods such as cDNA microarrays and RNA-Seq. Particularly important related investigations are the modeling and identification of gene regulatory networks from expression data sets. Such knowledge is fundamental for many applications, such as disease treatment, therapeutic intervention strategies and drug design, as well as for planning new high-throughput experiments. Methods have been developed for gene network modeling and identification from expression profiles. However, an important open problem is how to validate such approaches and their results. This work presents an objective approach for validation of gene network modeling and identification which comprises the following three main aspects: (1) Artificial Gene Networks (AGNs) model generation through theoretical models of complex networks, which is used to simulate temporal expression data; (2) a computational method for gene network identification from the simulated data, which is founded on a feature selection approach where a target gene is fixed and the expression profile is observed for all other genes in order to identify a relevant subset of predictors; and (3) validation of the identified AGN-based network through comparison with the original network. The proposed framework allows several types of AGNs to be generated and used in order to simulate temporal expression data. The results of the network identification method can then be compared to the original network in order to estimate its properties and accuracy. Some of the most important theoretical models of complex networks have been assessed: the uniformly-random Erdös-Rényi (ER), the small-world Watts-Strogatz (WS), the scale-free Barabási-Albert (BA), and geographical networks (GG).
The experimental results indicate that the inference method was sensitive to average degree variation, decreasing its network recovery rate as the average degree increased. The signal size was important for the inference method to get better accuracy in the network identification rate, presenting very good results with small expression profiles. However, the adopted inference method was not sensitive enough to recognize distinct structures of interaction among genes, presenting a similar behavior when applied to different network topologies. In summary, the proposed framework, though simple, was adequate for the validation of the inferred networks by identifying some properties of the evaluated method, which can be extended to other inference methods.


Subject(s)
Computational Biology/methods , Gene Regulatory Networks/genetics , Models, Genetic , Software Validation , Systems Biology/methods , Algorithms , Artificial Intelligence , Computer Simulation , Gene Expression , Synthetic Biology , Time Factors
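
Aspects (1) and (3) of the validation framework can be sketched as follows: generate a known network, then score an inferred network against it. This is a minimal illustration, assuming a directed Erdös-Rényi model parameterized by the expected in-degree; the simulation of expression data and the inference step are omitted, and the function names are hypothetical.

```python
import random


def erdos_renyi(n_genes, avg_degree, seed=0):
    """Directed ER network as a set of (source, target) edges. Each ordered
    pair is connected with probability avg_degree / (n_genes - 1), so the
    expected in-degree of every gene equals avg_degree."""
    rng = random.Random(seed)
    p = avg_degree / (n_genes - 1)
    return {(i, j)
            for i in range(n_genes)
            for j in range(n_genes)
            if i != j and rng.random() < p}


def compare_networks(true_edges, inferred_edges):
    """Edge-level comparison of the inferred network with the original."""
    tp = len(true_edges & inferred_edges)   # correctly recovered edges
    fp = len(inferred_edges - true_edges)   # false connections
    fn = len(true_edges - inferred_edges)   # missed connections
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}
```

Repeating the comparison over several seeds and average degrees would give the kind of recovery-rate trend the abstract reports.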
9.
BMC Bioinformatics ; 9: 451, 2008 Oct 22.
Article in English | MEDLINE | ID: mdl-18945362

ABSTRACT

BACKGROUND: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e.g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point out the best solution for each application. RESULTS: The intent of this work is to provide an open-source multiplatform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes (targets or predictors) is also implemented in the system. CONCLUSION: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software's effectiveness in two distinct types of biological problems. Besides, the environment can be used in different pattern recognition applications, although the main concern regards bioinformatics tasks.


Subject(s)
Computational Biology/methods , Genomics/methods , Pattern Recognition, Automated/methods , Software , Algorithms , Bayes Theorem , Data Interpretation, Statistical , Internet , Markov Chains , Models, Genetic , Reproducibility of Results , User-Computer Interface
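
The pairing of a search algorithm with a pluggable criterion function can be illustrated with a greedy sequential forward selection driven by Shannon conditional entropy. This is a generic sketch, not the API of the software described in the abstract; the function names and the data layout (one row of discrete feature values per sample) are assumptions.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(samples, target, features):
    """H(target | selected features) in bits, estimated from counts.
    With an empty feature list this is simply H(target)."""
    groups = defaultdict(Counter)
    for row, y in zip(samples, target):
        groups[tuple(row[f] for f in features)][y] += 1
    n = len(target)
    h = 0.0
    for counts in groups.values():
        n_x = sum(counts.values())
        for c in counts.values():
            h -= (n_x / n) * (c / n_x) * math.log2(c / n_x)
    return h


def forward_selection(samples, target, max_features,
                      criterion=conditional_entropy):
    """Greedily add the feature that most lowers the criterion; stop when
    no remaining feature improves it or max_features is reached."""
    selected = []
    remaining = list(range(len(samples[0])))
    best_score = criterion(samples, target, selected)
    while remaining and len(selected) < max_features:
        score, feature = min(
            (criterion(samples, target, selected + [f]), f) for f in remaining)
        if score >= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score
```

Greedy search is cheap but can miss features that are only informative jointly (an XOR-like target, for instance), which is one reason a tool might offer several search strategies side by side.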
10.
Article in English | MEDLINE | ID: mdl-18451429

ABSTRACT

An important topic in genomic sequence analysis is the identification of protein coding regions. In this context, several coding DNA model-independent methods, based on the occurrence of specific patterns of nucleotides at coding regions, have been proposed. Nonetheless, these methods have not been completely suitable due to their dependence on an empirically pre-defined window length required for a local analysis of a DNA region. We introduce a method, based on a modified Gabor-wavelet transform (MGWT), for the identification of protein coding regions. This novel transform is tuned to analyze periodic signal components and presents the advantage of being independent of the window length. We compared the performance of the MGWT with other methods using eukaryote datasets. The results show that the MGWT outperforms all assessed model-independent methods with respect to identification accuracy. These results indicate that the source of at least part of the identification errors produced by the previous methods is the fixed working scale. The new method not only avoids this source of errors, but also makes available a tool for detailed exploration of the nucleotide occurrence.


Subject(s)
DNA/genetics , Proteins/genetics , Sequence Analysis, DNA/statistics & numerical data , Computational Biology , Databases, Nucleic Acid , Databases, Protein , Globins/genetics , Humans , Models, Statistical , Pattern Recognition, Automated , Signal Processing, Computer-Assisted
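
The period-3 signal that model-independent methods look for in coding regions can be illustrated with a Gaussian-windowed projection onto the frequency 1/3, summed over several scales so that no single window length has to be chosen. This is a generic sketch in the spirit of a Gabor-type analysis and is not the MGWT defined in the paper; function names, the indicator-sequence encoding, and the choice of scales are assumptions.

```python
import cmath
import math


def period3_power(sequence, center, scale):
    """Power of the period-3 component around `center`, using a Gaussian
    envelope of width `scale`, summed over the four base indicator signals."""
    total = 0.0
    for base in "ACGT":
        projection = 0j
        for n, symbol in enumerate(sequence):
            if symbol == base:
                envelope = math.exp(-((n - center) ** 2) / (2.0 * scale ** 2))
                projection += envelope * cmath.exp(-2j * math.pi * n / 3.0)
        total += abs(projection) ** 2
    return total


def multiscale_profile(sequence, scales):
    """Period-3 power at every position, accumulated over several scales."""
    return [sum(period3_power(sequence, c, s) for s in scales)
            for c in range(len(sequence))]
```

A strictly periodic toy sequence such as `"ATG" * 30` gives a large value at its center, whereas a homopolymer of the same length gives a value close to zero.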
11.
Clin Ophthalmol ; 2(1): 109-22, 2008 Mar.
Article in English | MEDLINE | ID: mdl-19668394

ABSTRACT

Timely intervention for diabetic retinopathy (DR) lessens the possibility of blindness and can save considerable costs to health systems. To ensure that interventions are timely and effective requires methods of screening and monitoring pathological changes, including assessing outcomes. Fractal analysis, one method that has been studied for assessing DR, is potentially relevant in today's world of telemedicine because it provides objective indices from digital images of complex patterns such as are seen in retinal vasculature, which is affected in DR. We introduce here a protocol to distinguish between nonproliferative (NPDR) and proliferative (PDR) changes in retinal vasculature using a fractal analysis method known as local connected dimension (D(conn)) analysis. The major finding is that compared to other fractal analysis methods, D(conn) analysis better differentiates NPDR from PDR (p = 0.05). In addition, we are the first to show that fractal analysis can be used to differentiate between NPDR and PDR using automated vessel identification. Overall, our results suggest this protocol can complement existing methods by including an automated and objective measure obtainable at a lower level of expertise that experts can then use in screening for and monitoring DR.

12.
BMC Bioinformatics ; 8: 169, 2007 May 22.
Article in English | MEDLINE | ID: mdl-17519038

ABSTRACT

BACKGROUND: One goal of gene expression profiling is to identify signature genes that robustly distinguish different types or grades of tumors. Several tumor classifiers based on expression profiling have been proposed using microarray technique. Due to important differences in the probabilistic models of microarray and SAGE technologies, it is important to develop suitable techniques to select specific genes from SAGE measurements. RESULTS: A new framework to select specific genes that distinguish different biological states based on the analysis of SAGE data is proposed. The new framework applies the bolstered error for the identification of strong genes that separate the biological states in a feature space defined by the gene expression of a training set. Credibility intervals defined from a probabilistic model of SAGE measurements are used to identify the genes that distinguish the different states with more reliability among all gene groups selected by the strong genes method. A score taking into account the credibility and the bolstered error values in order to rank the groups of considered genes is proposed. Results obtained using SAGE data from gliomas are presented, thus corroborating the introduced methodology. CONCLUSION: The model representing counting data, such as SAGE, provides additional statistical information that allows a more robust analysis. The additional statistical information provided by the probabilistic model is incorporated in the methodology described in the paper. The introduced method is suitable to identify signature genes that lead to a good separation of the biological states using SAGE and may be adapted for other counting methods such as Massive Parallel Signature Sequencing (MPSS) or the recent Sequencing-By-Synthesis (SBS) technique. Some of such genes identified by the proposed method may be useful to generate classifiers.


Subject(s)
Computational Biology/methods , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Oligonucleotide Array Sequence Analysis , Astrocytoma/genetics , Astrocytoma/pathology , Brain/metabolism , Gene Library , Glioblastoma/genetics , Humans , Models, Statistical
13.
J Opt Soc Am A Opt Image Sci Vis ; 24(5): 1448-56, 2007 May.
Article in English | MEDLINE | ID: mdl-17429492

ABSTRACT

Proliferative diabetic retinopathy can lead to blindness. However, early recognition allows appropriate, timely intervention. Fluorescein-labeled retinal blood vessels of 27 digital images were automatically segmented using the Gabor wavelet transform and classified using traditional features such as area, perimeter, and an additional five morphological features based on the derivatives-of-Gaussian wavelet-derived data. Discriminant analysis indicated that traditional features do not detect early proliferative retinopathy. The best single feature for discrimination was the wavelet curvature with an area under the curve (AUC) of 0.76. Linear discriminant analysis with a selection of six features achieved an AUC of 0.90 (0.73-0.97, 95% confidence interval). The wavelet method was able to segment retinal blood vessels and classify the images according to the presence or absence of proliferative retinopathy.


Subject(s)
Algorithms , Artificial Intelligence , Diabetic Retinopathy/pathology , Fluorescein Angiography/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Retinal Vessels/pathology , Humans , Image Enhancement/methods , Reproducibility of Results , Sensitivity and Specificity
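
The area under the ROC curve used to rate single features can be computed directly from feature values as the probability that a randomly chosen positive image scores higher than a randomly chosen negative one (ties count one half). The sketch below shows that rank-based definition; the function name is hypothetical, and it is not claimed to be the procedure the authors used.

```python
def auc(scores_positive, scores_negative):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive case has the higher score, counting ties as half a win."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))
```

A value of 0.5 corresponds to a feature with no discriminating power, and 1.0 to perfect separation of the two groups.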
14.
IEEE Trans Syst Man Cybern B Cybern ; 36(2): 312-27, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16602592

ABSTRACT

The spatial relation "between" is a notion which is intrinsically both fuzzy and contextual, and depends, in particular, on the shape of the objects. The literature is quite poor on this and the few existing definitions do not take into account these aspects. In particular, an object B that is in a concavity of an object A1 not visible from an object A2 is considered between A1 and A2 for most definitions, which is counter intuitive. Also, none of the definitions deal with cases where one object is much more elongated than the other. Here, we propose definitions which are based on convexity, morphological operators, and separation tools, and a fuzzy notion of visibility. They correspond to the main intuitive exceptions of the relation. We distinguish between cases where objects have similar spatial extensions and cases where one object is much more extended than the other. Extensions to cases where objects, themselves, are fuzzy and to three-dimensional space are proposed as well. The original work proposed in this paper covers the main classes of situations and overcomes the limits of existing approaches, particularly concerning nonvisible concavities and extended objects. Moreover, the definitions capture the intrinsic imprecision attached to this relation. The main proposed definitions are illustrated on real data from medical images.


Subject(s)
Algorithms , Artificial Intelligence , Image Interpretation, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Information Storage and Retrieval/methods , Pattern Recognition, Automated/methods
15.
Comput Biol Med ; 34(5): 427-47, 2004 Jul.
Article in English | MEDLINE | ID: mdl-15145713

ABSTRACT

This paper describes a data mining environment for knowledge discovery in bioinformatics applications. The system has a generic kernel that implements the mining functions to be applied to input primary databases, with a warehouse architecture, of biomedical information. Both supervised and unsupervised classification can be implemented within the kernel and applied to data extracted from the primary database, with the results being suitably stored in a complex object database for knowledge discovery. The kernel also includes a specific high-performance library that allows designing and applying the mining functions in parallel machines. The experimental results obtained by the application of the kernel functions are reported.


Subject(s)
Computational Biology , Knowledge , Databases as Topic , Gene Expression Profiling , Systems Integration
16.
J Integr Neurosci ; 3(4): 415-32, 2004 Dec.
Article in English | MEDLINE | ID: mdl-15657977

ABSTRACT

Computational morphological analysis comprises the development of measures (indicators) that describe different form attributes of a neuron and provides additional parameters for classification algorithms. Our work addressed the problem of small group sizes often encountered in neuromorphological and neurophysiological research, automated classification tasks (unsupervised learning) and introduced a new morphological measure: the wavelet statistical moment. We analysed cat alpha/Y, beta/X and delta Golgi-stained retinal ganglion cells using six different shape features (circularity, 2nd statistical moment and entropy of Gaussian blurred images, wavelet statistical moment, number of terminations and the fractal dimension). This allowed us to compare the sensitivity of the methods in uniquely describing morphological attributes of these cells.


Subject(s)
Algorithms , Cluster Analysis , Retinal Ganglion Cells/cytology , Animals , Cats , Dendrites/physiology , Retinal Ganglion Cells/physiology
17.
Biochim Biophys Acta ; 1619(1): 98-112, 2003 Jan 02.
Article in English | MEDLINE | ID: mdl-12495820

ABSTRACT

Leiomyoma is a benign smooth muscle tumor of the uterus that affects many women in active reproductive life. It is composed of bundles of smooth muscle cells surrounded by extracellular matrix. We have recently shown that the glycosylation of extracellular matrix proteoglycans is modified in leiomyoma: increased amounts of galactosaminoglycans with structural modifications are present. The data presented here show that decorin is present in both normal myometrium and leiomyoma but tumoral decorin is glycosylated with longer galactosaminoglycan side chains. Furthermore, these chains contain a higher D-glucuronate/L-iduronate ratio, as compared to normal tissue. To determine if these changes in proteoglycan glycosylation correlate with modifications in the extracellular matrix organization, we compared the general structural architecture of leiomyoma to normal myometrium. By histochemical and immunofluorescence methods, we found a reorganization of muscle fibers and extracellular matrix, with changes in the distribution of glycoproteins, proteoglycans, and collagen. Thin reticular fibers, possibly composed of types I and III collagen, were replaced by thick fibers, possibly richer in type I collagen. Type I collagen colocalized with decorin both in leiomyoma and normal myometrium, in contrast to type IV collagen, which did not. The relative amount of decorin was increased and the distribution of decorin and collagen was totally modified in the tumor, as compared to the normal myometrium. These findings reveal that not only is decorin structure modified in leiomyoma but the tissue architecture is also changed, especially concerning the extracellular matrix.


Subject(s)
Leiomyoma/metabolism , Myometrium/metabolism , Proteoglycans/metabolism , Uterine Neoplasms/metabolism , Adult , Amino Acid Sequence , Chromatography, Gel , Chromatography, Ion Exchange , Decorin , Electrophoresis, Agar Gel , Extracellular Matrix Proteins , Female , Glycosylation , Humans , Microscopy, Fluorescence , Middle Aged , Molecular Sequence Data , Protein Conformation , Proteoglycans/chemistry , Proteoglycans/isolation & purification , Sequence Homology, Amino Acid
18.
J Comput Biol ; 9(1): 105-26, 2002.
Article in English | MEDLINE | ID: mdl-11911797

ABSTRACT

There are many algorithms to cluster sample data points based on nearness or a similarity measure. Often the implication is that points in different clusters come from different underlying classes, whereas those in the same cluster come from the same class. Stochastically, the underlying classes represent different random processes. The inference is that clusters represent a partition of the sample points according to which process they belong. This paper discusses a model-based clustering toolbox that evaluates cluster accuracy. Each random process is modeled as its mean plus independent noise, sample points are generated, the points are clustered, and the clustering error is the number of points clustered incorrectly according to the generating random processes. Various clustering algorithms are evaluated based on process variance and the key issue of the rate at which algorithmic performance improves with increasing numbers of experimental replications. The model means can be selected by hand to test the separability of expected types of biological expression patterns. Alternatively, the model can be seeded by real data to test the expected precision of that output or the extent of improvement in precision that replication could provide. In the latter case, a clustering algorithm is used to form clusters, and the model is seeded with the means and variances of these clusters. Other algorithms are then tested relative to the seeding algorithm. Results are averaged over various seeds. Output includes error tables and graphs, confusion matrices, principal-component plots, and validation measures. Five algorithms are studied in detail: K-means, fuzzy C-means, self-organizing maps, hierarchical Euclidean-distance-based and correlation-based clustering. The toolbox is applied to gene-expression clustering based on cDNA microarrays using real data. Expression profile graphics are generated and error analysis is displayed within the context of these profile graphics. 
A large amount of generated output is available over the web.


Subject(s)
Oligonucleotide Array Sequence Analysis/methods , Computational Biology , DNA Fingerprinting , Gene Expression Regulation , Genetic Markers , Humans
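
The evaluation loop described in the abstract (model each process as mean plus independent noise, generate points, cluster them, count misclustered points) can be sketched as below. This is a minimal illustration using a plain K-means and Gaussian noise; it is not the toolbox itself, and the function names and parameters are hypothetical.

```python
import itertools
import random


def simulate(means, sigma, replicates, rng):
    """Generate `replicates` noisy copies of each class mean profile."""
    points, labels = [], []
    for label, mean in enumerate(means):
        for _ in range(replicates):
            points.append([m + rng.gauss(0.0, sigma) for m in mean])
            labels.append(label)
    return points, labels


def kmeans(points, k, rng, iterations=50):
    """Plain K-means with centers initialized from randomly chosen points."""
    centers = rng.sample(points, k)
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def clustering_error(true_labels, assignment, k):
    """Number of misclustered points under the best matching of cluster
    indices to generating classes (cluster labels are arbitrary)."""
    best = len(true_labels)
    for perm in itertools.permutations(range(k)):
        errors = sum(1 for t, a in zip(true_labels, assignment) if perm[a] != t)
        best = min(best, errors)
    return best
```

Sweeping `sigma` and `replicates` and averaging `clustering_error` over several seeds would reproduce, in miniature, the kind of error-versus-replication study the abstract describes.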